Multimodal Translation System Using Texture-Mapped Lip-Sync Images for Video Mail and Automatic Dubbing Applications



Similar resources

Multimodal Translation System Using Texture-Mapped Lip-Sync Images for Video Mail and Automatic Dubbing Applications

We introduce a multimodal English-to-Japanese and Japanese-to-English translation system that also translates the speaker’s speech motion by synchronizing it to the translated speech. This system also introduces both a face synthesis technique that can generate any viseme lip shape and a face tracking technique that can estimate the original position and rotation of a speaker’s face in an image...


A REAL-TIME LIP SYNC SYSTEM USING A GENETIC ALGORITHM FOR AUTOMATIC NEURAL NETWORK CONFIGURATION (ThuAmSS2)

In this paper we present a new method for mapping natural speech to lip shape animation in real time. The speech signal, represented by MFCC vectors, is classified into viseme classes using neural networks. The topology of neural networks is automatically configured using genetic algorithms. This eliminates the need for tedious manual neural network design by trial and error and considerably im...


Automatic generation of dubbing video slides for mobile wireless environment

Mobile wireless video delivery is still challenging due to its limited bandwidth and dynamic channel status. In this paper, a novel approach named Dubbing Video Slides (DVS) is proposed to cope with the bandwidth limitation problem. Based on a statistical video content importance analysis, the DVS method can dynamically select and transmit representative video frames which are relatively more impor...


Speaker independence in automated lip-sync for audio-video communication

By analyzing the absolute value of the Fourier transform of a speaker’s voice signal we can predict the position of the mouth for English vowel sounds. This is without the use of text, speech recognition or mechanical or other sensing devices attached to the speaker’s mouth. This capability can reduce the time required for mouth animation considerably. We expect it to be competitive eventually ...


Multimodal speaker/speech recognition using lip motion, lip texture and audio

We present a new multimodal speaker/speech recognition system that integrates audio, lip texture and lip motion modalities. Fusion of audio and face texture modalities has been investigated in the literature before. The emphasis of this work is to investigate the benefits of inclusion of lip motion modality for two distinct cases: speaker and speech recognition. The audio modality is represente...



Journal

Journal title: EURASIP Journal on Advances in Signal Processing

Year: 2004

ISSN: 1687-6172,1687-6180

DOI: 10.1155/s1110865704404259